Add Safe Scientific Development System for Claude Code #10

edeno · 2025-10-23T22:49:41Z

Summary

Implements a three-layer defense system to ensure safe, regression-free scientific development with Claude Code:

Layer 1 (Hooks): Automatic enforcement of environment and testing requirements
Layer 2 (Skills): Workflow guidance for TDD, numerical validation, and safe refactoring
Layer 3 (Documentation): Enhanced CLAUDE.md with operational rules and decision trees

Key Features

✅ Environment Consistency: Conda environment checking for all Python commands
✅ Regression Prevention: Snapshot change detection with approval gates, test-before-commit reminders
✅ Numerical Accuracy: Tolerance specifications (1e-14 for refactoring, 1e-10 for algorithms), mathematical invariant verification
✅ Guided Autonomy: Claude runs tests/validates automatically, asks permission for commits/snapshot updates
✅ Pragmatic TDD: Test-first for new features, test-verify for simple bugs
✅ JAX Integration: Special handling for JAX code optimization and validation
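
As a rough illustration of the two tolerance levels listed above, the sketch below compares two scalar values against a tolerance. The helper name `within_tol`, the variable names, and the use of an absolute (rather than relative) difference are assumptions made for this example; the PR does not show how lib/numerical_validation.sh performs its comparisons.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): succeed when |actual - expected| <= tol.
# awk is used because plain shell arithmetic has no floating point.
within_tol() {
  awk -v a="$1" -v b="$2" -v tol="$3" 'BEGIN {
    d = (a + 0) - (b + 0)
    if (d < 0) d = -d
    exit (d <= tol + 0) ? 0 : 1
  }'
}

# Tolerances quoted in the PR description; the variable names are illustrative.
REFACTOR_TOL=1e-14    # refactoring: results should match almost exactly
ALGORITHM_TOL=1e-10   # algorithm changes: small numerical drift is accepted

if within_tol 1.0 1.00000000001 "$ALGORITHM_TOL"; then
  echo "within algorithm tolerance"
fi
if ! within_tol 1.0 1.00000000001 "$REFACTOR_TOL"; then
  echo "outside refactoring tolerance"
fi
```

The same difference of about 1e-11 is accepted at the algorithm level and rejected at the refactoring level, which is the distinction the two thresholds are meant to draw.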

Components Delivered

Hooks (4 files)

pre-tool-use.sh - Environment validation, test reminders, snapshot protection
user-prompt-submit.sh - Snapshot change detection
lib/env_check.sh - Conda environment utilities
lib/numerical_validation.sh - Validation and approval utilities

Skills (3 files)

scientific-tdd - Pragmatic test-driven development workflow
numerical-validation - Comprehensive numerical correctness verification
safe-refactoring - Zero-tolerance behavior-preserving refactoring

Documentation (5 files)

Enhanced CLAUDE.md with critical operational rules, numerical standards, workflow guide
SAFE_DEVELOPMENT_GUIDE.md - Comprehensive user guide
SYSTEM_VERIFICATION.md - Complete verification report
Skills and Hooks README files

Testing (1 file)

test_safe_dev_system.sh - Integration test (24/24 checks passing)

Test Plan

All 24 integration tests passing
Hooks execute correctly (environment check, snapshot protection)
Skills have proper XML headers
Documentation complete and accurate
Git worktrees properly ignored
No regressions in existing functionality

Verification

Run the integration test:

./tests/test_safe_dev_system.sh

Expected: All tests pass (24/24)

Usage

After merging, Claude Code will automatically:

Follow operational rules from enhanced CLAUDE.md
Use appropriate skills for different task types
Enforce environment requirements via hooks
Require approval for snapshot updates and commits

See docs/SAFE_DEVELOPMENT_GUIDE.md for complete usage instructions.

🤖 Generated with Claude Code

- env_check.sh: Conda environment detection and validation
- numerical_validation.sh: Snapshot and invariant checking utilities
- Fix .gitignore to allow .claude/hooks/lib/ (was blocked by /lib/ pattern)

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

- Warns when Python commands run outside conda environment
- Reminds to run tests before commits
- Blocks snapshot updates without approval

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
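
A minimal sketch of the kind of environment warning this commit describes, assuming the hook compares the `CONDA_DEFAULT_ENV` variable (set by `conda activate`) with an expected environment name. The function name `check_conda_env` and the environment name `demo_env` are hypothetical; the actual logic in lib/env_check.sh is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical check (not the PR's code): warn when the active conda
# environment differs from the one the project expects.
check_conda_env() {
  local expected="$1"
  local active="${CONDA_DEFAULT_ENV:-}"   # empty when no environment is active
  if [ "$active" != "$expected" ]; then
    echo "WARNING: expected conda env '$expected', found '${active:-none}'" >&2
    return 1
  fi
  return 0
}

# Example: 'demo_env' is a placeholder environment name.
if check_conda_env demo_env 2>/dev/null; then
  echo "environment ok"
else
  echo "environment mismatch"
fi
```

Returning a status instead of exiting leaves the decision to the caller, which fits a hook that warns rather than blocks.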

The find command on line 42 searched for files matching "*.pytest_cache", a pattern that matches nothing. The .pytest_cache directory contains entries with various names (.gitignore, CACHEDIR.TAG, README.md, v/cache/, and so on), none of which end in .pytest_cache.

Changed from:

    find .pytest_cache -name "*.pytest_cache" -mmin -5

To:

    find .pytest_cache -type f -mmin -5

This finds any file in the .pytest_cache directory modified within the last 5 minutes, so recent test runs are detected correctly.

Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
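
The corrected check can be wrapped in a small function, sketched below. The function name `tests_ran_recently` and its arguments are assumptions for illustration; only the `find` invocation comes from the commit message. Note that `-mmin` is supported by GNU and BSD find but is not part of POSIX.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the corrected find command from the commit.
# Succeeds if any regular file under the cache directory was modified
# less than the given number of minutes ago.
tests_ran_recently() {
  local cache_dir="${1:-.pytest_cache}"
  local minutes="${2:-5}"
  [ -d "$cache_dir" ] || return 1
  # -type f matches every regular file, whatever its name.
  [ -n "$(find "$cache_dir" -type f -mmin "-$minutes" 2>/dev/null | head -n 1)" ]
}

# Demonstration in a throwaway directory that mimics a fresh pytest cache.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/v/cache"
touch "$demo_dir/CACHEDIR.TAG" "$demo_dir/v/cache/lastfailed"

if tests_ran_recently "$demo_dir"; then
  echo "recent test run detected"
fi
```

With the original `-name "*.pytest_cache"` filter, the same directory would yield no matches, so the reminder would fire even right after a test run.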

- Detects when snapshot files have changed
- Reminds Claude to provide full analysis before updates

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
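
One way such detection could work is to filter a list of changed paths for snapshot files, as sketched below. The path conventions (a `__snapshots__/` directory and a `.ambr` extension) and the function name `filter_snapshot_paths` are assumptions for illustration; the PR does not state where this project keeps its snapshots or how user-prompt-submit.sh finds them.

```shell
#!/usr/bin/env bash
# Hypothetical filter (not the PR's code): read changed paths on stdin and
# print only those that look like snapshot files.
# Assumed conventions: a __snapshots__/ directory or a .ambr extension.
filter_snapshot_paths() {
  grep -E '(^|/)__snapshots__/|\.ambr$' || true   # no match is not an error
}

# In a repository this could be fed from git, for example:
#   git diff --name-only HEAD | filter_snapshot_paths
changed="$(printf '%s\n' \
  'src/model.py' \
  'tests/__snapshots__/test_model.ambr' | filter_snapshot_paths)"

if [ -n "$changed" ]; then
  echo "Snapshot files changed; analysis required before updating:"
  echo "$changed"
fi
```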

Pragmatic TDD workflow for scientific code with numerical validation. Part of safe scientific development system.

Comprehensive numerical validation workflow with tolerance guidelines.

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Zero-tolerance workflow for behavior-preserving refactoring.

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

…and workflow guide

Add three critical sections to CLAUDE.md for safe scientific development:

**Task 7 - Critical Operational Rules:**
- Mandatory skills usage (scientific-tdd, numerical-validation, safe-refactoring, jax)
- Environment enforcement rules (conda activation required)
- Guided autonomy boundaries (what Claude can do vs. must ask permission)
- Snapshot update approval process with required 4-part analysis format

**Task 8 - Numerical Accuracy Standards:**
- When numerical validation is required (which files/components)
- Tolerance specifications table (1e-14 for refactoring, 1e-10 for algorithms)
- Mathematical invariants that must always hold (probabilities, stochastic matrices, etc.)
- Validation commands (property tests, golden regression, snapshots)

**Task 9 - Workflow Selection Guide:**
- Decision tree for selecting appropriate workflow/skill
- Task-based guidance (new features, bugs, refactoring, JAX, etc.)
- JAX code requirements and best practices
- JAX-specific validation checklist

These enhancements provide Claude with clear operational guidelines, numerical accuracy requirements, and workflow selection criteria for scientific development.

Before: 132 lines
After: 357 lines
Added: 225 lines

Part of safe scientific development system implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
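
The invariants named under Task 8 (probabilities, stochastic matrices) can be illustrated with a small check that every row of a matrix is non-negative and sums to 1 within a tolerance. This is a sketch under stated assumptions: the function name `is_row_stochastic`, the whitespace-separated input format, and the default tolerance of 1e-10 are chosen for the example and are not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical invariant check (not the PR's code): read a matrix from stdin,
# one row per line, and succeed only if every entry is non-negative and every
# row sums to 1 within the given tolerance.
is_row_stochastic() {
  awk -v tol="${1:-1e-10}" '
    NF == 0 { next }                      # skip blank lines
    {
      sum = 0
      for (i = 1; i <= NF; i++) {
        if ($i + 0 < 0) bad = 1           # probabilities cannot be negative
        sum += $i
      }
      d = sum - 1
      if (d < 0) d = -d
      if (d > tol + 0) bad = 1            # row does not sum to 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

if printf '0.9 0.1\n0.2 0.8\n' | is_row_stochastic; then
  echo "transition matrix is row-stochastic"
fi
if ! printf '0.9 0.2\n0.2 0.8\n' | is_row_stochastic; then
  echo "invariant violated: a row does not sum to 1"
fi
```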

Task 10: Create .claude/skills/README.md
- Overview of all 3 skills (scientific-tdd, numerical-validation, safe-refactoring)
- Usage guidance for each skill
- Workflow summaries
- Integration and maintenance information

Task 11: Create .claude/hooks/README.md
- Overview of both hooks (pre-tool-use.sh, user-prompt-submit.sh)
- Utility library documentation (env_check.sh, numerical_validation.sh)
- Testing procedures and expected behavior
- Debugging guidance
- Integration with skills explanation

Both READMEs provide clear documentation of the safe scientific development system components.

Part of safe scientific development system implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Task 12: Integration test script (tests/test_safe_dev_system.sh)
- Tests all 8 categories: hook utilities, hooks, skills, CLAUDE.md, functional tests, snapshot protection, git config, documentation
- Validates complete system installation and functionality
- All tests passing

Task 13: Comprehensive user guide (docs/SAFE_DEVELOPMENT_GUIDE.md)
- Quick start for users and Claude
- Three-layer system explanation
- Common workflow examples with checkpoints
- Approval gate processes with checklists
- Numerical tolerance guidelines
- Troubleshooting guide
- Customization instructions
- Best practices and FAQ

Both are final documentation/testing deliverables for the safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Complete verification of all components, functional testing results, and maintenance procedures. Final deliverable for safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

- Add proper XML metadata headers to scientific-tdd skill
- Add proper XML metadata headers to numerical-validation skill
- Add proper XML metadata headers to safe-refactoring skill

Headers follow Claude Code skill format with name, description, tags, and version.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Copilot

Pull Request Overview

This PR implements a comprehensive three-layer defense system to ensure safe, regression-free scientific development with Claude Code. The system combines automatic enforcement through hooks, workflow guidance through skills, and enhanced documentation to prevent numerical regressions and maintain code quality.

Key changes include:

Layer 1: Hooks that automatically enforce environment requirements, test reminders, and snapshot protection
Layer 2: Three skills (scientific-tdd, numerical-validation, safe-refactoring) providing structured workflows for different development scenarios
Layer 3: Enhanced CLAUDE.md with operational rules, numerical standards, and workflow decision trees

Reviewed Changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_safe_dev_system.sh | Integration test suite validating all system components |
| docs/SYSTEM_VERIFICATION.md | Comprehensive verification report documenting system status and capabilities |
| docs/SAFE_DEVELOPMENT_GUIDE.md | User guide explaining workflows, approval gates, and troubleshooting |
| CLAUDE.md | Enhanced with critical operational rules, numerical standards, and workflow selection guide |
| .claude/skills/scientific-tdd/skill.md | Pragmatic TDD workflow for new features with numerical validation |
| .claude/skills/safe-refactoring/skill.md | Zero-tolerance refactoring workflow ensuring exact behavioral match |
| .claude/skills/numerical-validation/skill.md | Comprehensive numerical correctness verification workflow |
| .claude/skills/README.md | Overview of available skills and their usage patterns |
| .claude/hooks/user-prompt-submit.sh | Post-prompt hook detecting snapshot changes |
| .claude/hooks/pre-tool-use.sh | Pre-execution hook enforcing environment and snapshot requirements |
| .claude/hooks/lib/numerical_validation.sh | Utilities for snapshot detection and approval management |
| .claude/hooks/lib/env_check.sh | Utilities for conda environment validation |
| .claude/hooks/README.md | Comprehensive hook documentation and debugging guide |

- Fix import sorting (ruff I001)
- Remove unused variable assignment (ruff F841)
- Remove trailing whitespace (ruff W291)
- Remove whitespace from blank lines (ruff W293)
- Remove unused import (ruff F401)

All auto-fixed with ruff check --fix and ruff format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Black and ruff have slightly different formatting preferences. Applying black formatting to match CI requirements.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

- Remove black formatting check from CI workflow
- Use ruff format exclusively for code formatting
- Reformat all files with ruff format

Ruff provides equivalent formatting to black with additional linting capabilities, simplifying the toolchain.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Previous commit included the workflow file but the edit didn't apply correctly. This commit properly removes the black formatting check step.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

edeno and others added 12 commits October 23, 2025 17:59

feat: add scientific-tdd skill for test-driven development

16687ba

feat: add safe-refactoring skill for structure changes

4afc3af

edeno requested a review from Copilot October 23, 2025 22:50

Copilot AI reviewed Oct 23, 2025

edeno and others added 4 commits October 23, 2025 18:53

fix: apply black formatting for CI compliance

96de5c5

edeno merged commit e930700 into main Oct 24, 2025
4 of 7 checks passed

edeno deleted the feature/safe-scientific-dev branch October 24, 2025 02:13
